Finite mixture model of conditional dependencies modes to cluster categorical data

Abstract

We propose a parsimonious extension of the classical latent class model to cluster categorical data by relaxing the class conditional independence assumption. Under this new mixture model, named the Conditional Modes Model, variables are grouped into conditionally independent blocks. The corresponding block distribution is a parsimonious multinomial distribution in which the few free parameters correspond to the most likely modality crossings, while the remaining probability mass is spread uniformly over the other modality crossings. The proposed model thus brings out the intra-class dependency between variables and summarizes each class by a few characteristic modality crossings. Model selection is performed via a Metropolis-within-Gibbs sampler to overcome the computational intractability of the block structure search. As this approach involves computing the integrated complete-data likelihood, we propose a new method (exact for the continuous parameters and approximate for the discrete ones) that avoids the biases of the BIC criterion pointed out by our experiments. Finally, the parameters are estimated only for the best model, via an EM algorithm. The characteristics of the new model are illustrated on simulated data and on two biological data sets. These results strengthen the idea that this simple model reduces the biases induced by the conditional independence assumption and gives meaningful parameters. Both applications were performed with the R package CoModes.
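The block distribution described in the abstract can be illustrated with a minimal Python sketch. It is not the CoModes implementation, and every function and argument name in it is hypothetical. It assumes that each mode carries its own free probability, that every non-mode crossing receives an equal share of the remaining mass, and that all classes share one block partition (the abstract does not spell this out). The class-conditional probability is then the product over blocks, and the mixture probability is the proportion-weighted sum over classes.

```python
from itertools import product
from math import prod


def make_block_pmf(n_modalities, modes):
    """Build the pmf of one block: free probabilities on the modes,
    the remaining mass shared equally by all other modality crossings.

    n_modalities: number of modalities of each variable in the block.
    modes: dict mapping a crossing (tuple of level indices) to its probability.
    Both argument names are illustrative assumptions.
    """
    n_other = prod(n_modalities) - len(modes)
    remaining = 1.0 - sum(modes.values())
    if remaining < -1e-12:
        raise ValueError("mode probabilities sum to more than 1")
    # Probability of each crossing that is not a mode.
    uniform = remaining / n_other if n_other > 0 else 0.0
    if any(p < uniform for p in modes.values()):
        raise ValueError("a mode must be at least as likely as a non-mode crossing")

    def pmf(crossing):
        return modes.get(tuple(crossing), uniform)

    return pmf


def class_conditional(x, blocks, block_pmfs):
    """Blocks are independent given the class, so their probabilities multiply."""
    return prod(pmf(tuple(x[j] for j in block))
                for block, pmf in zip(blocks, block_pmfs))


def mixture_pmf(x, proportions, blocks, pmfs_per_class):
    """Finite mixture: proportion-weighted sum of class-conditional probabilities."""
    return sum(pi * class_conditional(x, blocks, pmfs)
               for pi, pmfs in zip(proportions, pmfs_per_class))


# Toy example: three variables with 2, 3 and 2 modalities,
# grouped into the blocks {0, 1} and {2}; two classes.
levels = [2, 3, 2]
blocks = [(0, 1), (2,)]
class_1 = [make_block_pmf([2, 3], {(0, 0): 0.5, (1, 2): 0.3}),
           make_block_pmf([2], {(0,): 0.9})]
class_2 = [make_block_pmf([2, 3], {(1, 1): 0.7}),
           make_block_pmf([2], {(1,): 0.6})]
proportions = [0.4, 0.6]

p = mixture_pmf((0, 0, 0), proportions, blocks, [class_1, class_2])
```

In the first block of class 1, the two modes take 0.8 of the mass and the four remaining crossings share the other 0.2, so the block is described by two free parameters instead of the five a full multinomial over six crossings would need.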

Bibliographic details

Authors: Marbac, Matthieu; Biernacki, Christophe; Vandewalle, Vincent
Year: 2014
Original format: PDF
Language: English

Similar literature

1. Latent class model with conditional dependency per modes to cluster categorical data [J]. Marbac Matthieu, Biernacki Christophe, Vandewalle Vincent. Advances in Data Analysis and Classification. 2016, No. 2
2. Model-Based Clustering for Conditionally Correlated Categorical Data [J]. Marbac Matthieu, Biernacki Christophe, Vandewalle Vincent. Journal of Classification. 2015, No. 2
3. Likelihood-based tests for a class of misspecified finite mixture models for ordinal categorical data [J]. Colombi Roberto, Giordano Sabrina. Test: An Official Journal of the Spanish Society of Statistics and Operations Research. 2019, No. 4
4. On Fuzzy Clustering for Categorical Multivariate Data Induced by Polya Mixture Models [C]. Yuchi Kanzawa. International Conference on Modeling Decisions for Artificial Intelligence. 2017
5. Finite mixture models for clustering, dimension reduction and privacy preserving data mining [D]. Lin, Xiaodong. 2003
6. A joint finite mixture model for clustering genes from independent Gaussian and beta distributed data [O]. Xiaofeng Dai, Timo Erkkilä, Olli Yli-Harja. 2009
7. Bayesian Clustering of Categorical Time Series Using Finite Mixtures of Markov Chain Models [O]. Frühwirth-Schnatter Sylvia, Pamminger Christoph. 2009
